Conversation
184531e to
eeaf668
Compare
* Stop adding to-one-joined entity keys to query identifiers * Prune unneeded to-one JOINs Closes dotnet#29182
eeaf668 to
4a3ef03
Compare
There was a problem hiding this comment.
Pull request overview
This PR improves query SQL generation by avoiding redundant identifier expansion and enabling pruning of unnecessary to-one reference joins, especially benefiting split queries (issue #29182).
Changes:
- Treat single-result/to-one joins as not increasing cardinality, so inner identifiers aren’t added to the outer query identifier.
- Detect to-one joins via join predicates and mark corresponding LEFT JOINs as prunable.
- Update many SQL assertion baselines across SqlServer/Sqlite tests to reflect fewer JOINs, projections, and ORDER BY columns.
Reviewed changes
Copilot reviewed 60 out of 60 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/EFCore.Relational/Query/SqlExpressions/SelectExpression.cs | Adds to-one join awareness to identifier propagation and marks certain LEFT JOINs as prunable; introduces predicate-based to-one detection. |
| test/EFCore.Sqlite.FunctionalTests/BulkUpdates/NorthwindBulkUpdatesSqliteTest.cs | Updates SQL baselines to remove now-pruned LEFT JOIN subqueries/parameters. |
| test/EFCore.Sqlite.FunctionalTests/BulkUpdates/NonSharedModelBulkUpdatesSqliteTest.cs | Updates SQL baseline to remove redundant LEFT JOIN. |
| test/EFCore.SqlServer.FunctionalTests/BulkUpdates/NorthwindBulkUpdatesSqlServerTest.cs | Updates SQL baselines to remove now-pruned LEFT JOIN subqueries/parameters. |
| test/EFCore.SqlServer.FunctionalTests/BulkUpdates/NonSharedModelBulkUpdatesSqlServerTest.cs | Updates SQL baseline to remove redundant LEFT JOIN. |
| test/EFCore.SqlServer.FunctionalTests/Query/*.cs (multiple) | Updates many SQL baselines (fewer projected key columns, fewer JOINs, simplified ORDER BY). |
|
@roji Firstly thanks for tackling this problem which seems to improve a lot of queries! I have potentially found an issue when a one-to-one is involved. Firstly, depending on which side is declared as the principal and which one you start with as the root of your query, you may either get an |
|
I put the two tests collapsed at the bottom of the post in
-- SplitTestContextA
SELECT [b].[Id], [b].[AuthorId], [a].[Id]
FROM [Blogs] AS [b]
INNER JOIN [Author] AS [a] ON [b].[AuthorId] = [a].[Id]
ORDER BY [b].[Id]
-- SplitTestContextA
SELECT [p].[Id], [p].[BlogId], [b].[Id]
FROM [Blogs] AS [b]
INNER JOIN [Author] AS [a] ON [b].[AuthorId] = [a].[Id]
INNER JOIN [Post] AS [p] ON [b].[Id] = [p].[BlogId]
ORDER BY [b].[Id]The inner join to Author is not pruned. I don't think it's needed, and it's 'fixed' by also passing
-- SplitTestContextB
SELECT [b].[Id], [a].[Id], [a].[BlogId]
FROM [Blogs] AS [b]
LEFT JOIN [Author] AS [a] ON [b].[Id] = [a].[BlogId]
ORDER BY [b].[Id], [a].[Id]
-- SplitTestContextB
SELECT [p].[Id], [p].[BlogId], [b].[Id], [a].[Id]
FROM [Blogs] AS [b]
LEFT JOIN [Author] AS [a] ON [b].[Id] = [a].[BlogId]
INNER JOIN [Post] AS [p] ON [b].[Id] = [p].[BlogId]
ORDER BY [b].[Id], [a].[Id]Here, the left join to Author is not pruned, and I think it's related to the identifier matching process not expecting the one-to-one to be sided this way. Obviously, if it's simply that this enhancement doesn't or can't cover one-to-one cases, that's fine and it's still an improvement. I've not found any actual breakages yet. Details [ConditionalFact]
public async Task SplitTestA()
{
var contextFactory = await InitializeAsync<SplitTestContextA>(seed: a => a.SeedAsync());
using var context = contextFactory.CreateContext();
_ = context
.Set<SplitTestContextA.Blog>()
.Include(x => x.Author)
.Include(x => x.Posts)
.AsSplitQuery()
.TagWith(nameof(SplitTestContextA))
.ToList();
var sql = TestSqlLoggerFactory.Sql;
}
[ConditionalFact]
public async Task SplitTestB()
{
var contextFactory = await InitializeAsync<SplitTestContextB>(seed: a => a.SeedAsync());
using var context = contextFactory.CreateContext();
_ = context
.Set<SplitTestContextB.Blog>()
.Include(x => x.Author)
.Include(x => x.Posts)
.AsSplitQuery()
.TagWith(nameof(SplitTestContextB))
.ToList();
var sql = TestSqlLoggerFactory.Sql;
}
protected class SplitTestContextA(DbContextOptions options) : DbContext(options)
{
public DbSet<Blog> Blogs { get; set; }
public class Blog
{
public int Id { get; set; }
public int AuthorId { get; set; }
public Author Author { get; set; }
public ICollection<Post> Posts { get; set; }
}
public class Author
{
public int Id { get; set; }
public Blog Blog { get; set; }
}
public class Post
{
public int Id { get; set; }
public Blog Blog { get; set; }
}
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<Blog>()
.HasOne(b => b.Author)
.WithOne(a => a.Blog)
.HasForeignKey<Blog>(a => a.AuthorId);
}
public async Task SeedAsync()
{
Add(new Blog
{
Author = new Author(),
Posts = [new Post()]
});
await SaveChangesAsync();
}
}
protected class SplitTestContextB(DbContextOptions options) : DbContext(options)
{
public DbSet<Blog> Blogs { get; set; }
public class Blog
{
public int Id { get; set; }
public Author Author { get; set; }
public ICollection<Post> Posts { get; set; }
}
public class Author
{
public int Id { get; set; }
public int BlogId { get; set; }
public Blog Blog { get; set; }
}
public class Post
{
public int Id { get; set; }
public Blog Blog { get; set; }
}
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<Blog>()
.HasOne(b => b.Author)
.WithOne(a => a.Blog)
.HasForeignKey<Author>(a => a.BlogId);
}
public async Task SeedAsync()
{
Add(new Blog
{
Author = new Author(),
Posts = [new Post()]
});
await SaveChangesAsync();
}
} |
|
@stevendarby thanks for all of the above, and for checking the change - much appreciated! Yes, you're right about INNER JOIN not getting pruned.
Next steps:
How does that all sound? PS even though this PR doesn't remove the INNER JOIN, it does remove the identifiers from the ORDER BY (and also from the projection); that's also a significant improvement, even if not the full optimization as for non-required navigations. |
|
@roji thanks for the explanation, that all sounds good to me. I could have missed it but I don't think you covered why the left join isn't pruned from |
| @@ -634,7 +634,7 @@ LEFT JOIN ( | |||
| FROM [EntityCompositeKeyEntityTwo] FOR SYSTEM_TIME AS OF '2010-01-01T00:00:00.0000000' AS [e3] | |||
| INNER JOIN [EntityCompositeKeys] FOR SYSTEM_TIME AS OF '2010-01-01T00:00:00.0000000' AS [e4] ON [e3].[CompositeKeySkipSharedKey1] = [e4].[Key1] AND [e3].[CompositeKeySkipSharedKey2] = [e4].[Key2] AND [e3].[CompositeKeySkipSharedKey3] = [e4].[Key3] | |||
| ) AS [s1] ON [e].[Id] = [s1].[TwoSkipSharedId] | |||
| ORDER BY [e].[Id], [s].[ThreeId], [s].[TwoId], [s].[Id], [s0].[SelfSkipSharedLeftId], [s0].[SelfSkipSharedRightId], [s0].[Id], [s1].[TwoSkipSharedId], [s1].[CompositeKeySkipSharedKey1], [s1].[CompositeKeySkipSharedKey2], [s1].[CompositeKeySkipSharedKey3], [s1].[Key1], [s1].[Key2] | |||
| ORDER BY [e].[Id], [s].[ThreeId], [s].[TwoId], [s0].[SelfSkipSharedLeftId], [s0].[SelfSkipSharedRightId], [s1].[TwoSkipSharedId], [s1].[CompositeKeySkipSharedKey1], [s1].[CompositeKeySkipSharedKey2] | |||
There was a problem hiding this comment.
I hoped to get my head around this more before commenting, but may not have time. Bit unsure of a few cases like this where keys are partially pruned from the order by, e.g. why just CompositeKeySkipSharedKey3 pruned? May be fine! In mind is a possible interaction with that mechanism that removes the last column from the order by when collections joins are involved.
Closes #29182
Closes #29662